
[Stones from Other Hills] How to Deploy a Deep Learning Application Developed in MATLAB to the NVIDIA Jetson Xavier NX

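"Stones from other hills can polish jade": standing on the shoulders of giants lets you see higher and go farther, and on the road of research a favorable wind helps you move faster. With that in mind, we collect practical code links, datasets, software, programming tips and more in the "Stones from Other Hills" column. Stay tuned.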

Author: 老朽 (Zhihu)

Address: https://www.zhihu.com/people/lao-xiu-60

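Suppose I have a MATLAB example that grabs frames from a camera in real time and classifies them. How do I get it running on an NVIDIA Jetson Xavier NX?

For example, YOLOv3 or YOLOv4 imported from Darknet: yolov3-yolov4-matlab[1], or see the script at the end of this article.
Or a simpler example that classifies webcam images: Deployment and Classification of Webcam Images on NVIDIA Jetson TX2 Platform[2]
The second example is not complicated in itself and runs if you follow the MATLAB documentation. Problems usually come from the environment setup, so that is today's focus. For the first example and the complete workflow (including problems you may hit along the way), feel free to contact 老朽 if you get stuck (a message on Zhihu is fine; I check every day).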

01

Desktop PC Software Installation and Setup
1. MATLAB R2021a, running on Windows 10 Enterprise
2. CUDA 11.1
3. cuDNN 8.0.4
4. TensorRT 7.2.2
5. Visual Studio 2017
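For installation, see [3]
For setup, see: Setting Up the Prerequisite Products[4]
Install add-ons and hardware support packages for MATLAB. The original post shows a screenshot of the list, and all of the listed packages can be installed.
The required ones are: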
1. MATLAB Coder Interface for Deep Learning Libraries
2. GPU Coder Interface for Deep Learning Libraries
3. MATLAB Coder Support Package for NVIDIA Jetson and NVIDIA DRIVE Platforms
Strongly recommended:
1. Deep Learning Toolbox Model Quantization Library
2. Simulink Coder Support Package for NVIDIA Jetson and NVIDIA DRIVE Platforms
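If anything else is missing, later steps should report an error that tells you so.
If you have trouble installing add-ons or hardware support packages, see: 老朽: 玩转MATLAB附加功能/硬件支持包安装 (on installing MATLAB add-ons and hardware support packages)[5]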


02

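Installing and Setting Up the NVIDIA Jetson Xavier NX
Initial steps
1. Download the latest Jetson Xavier NX Developer Kit SD Card Image[6] from NVIDIA
2. Follow the steps to write the SD card and complete the installation
3. Connect a monitor, keyboard, mouse and network cable, then power on
4. After a successful boot, configure the hostname and user account, and update the system while you are at it
5. Log in remotely over SSH (for example with PuTTY or MobaXterm) and confirm that the login succeeds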
Install the required libraries and set the environment variables on the Xavier NX
Reference: Install and Setup Prerequisites for NVIDIA Boards[7]
1. Install the required libraries
    sudo apt-get install libsdl1.2-dev v4l-utils sox libsox-fmt-all libsox-dev
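As a quick sanity check after the install, the following sketch prints any of the five packages that dpkg does not report as installed. The function name missing_packages is made up for illustration; it relies only on the standard dpkg-query tool found on Ubuntu-based images.

```shell
#!/usr/bin/env bash
# Print each package from the argument list that dpkg does not report as installed.
# (missing_packages is an illustrative helper, not part of any NVIDIA or MathWorks tool.)
missing_packages() {
    for pkg in "$@"; do
        # dpkg-query prints "install ok installed" for a fully installed package.
        status=$(dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null) || status=""
        case "$status" in
            *"install ok installed"*) ;;   # present, print nothing
            *) echo "$pkg" ;;              # absent or only partially installed
        esac
    done
}

# The package list from the apt-get command above.
missing_packages libsdl1.2-dev v4l-utils sox libsox-fmt-all libsox-dev
```

If the call prints nothing, all five packages are present; any name it does print can be passed to apt-get install again.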
2. Set the environment variables
Edit /etc/environment as follows:
    PATH="/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
    LD_LIBRARY_PATH="/usr/local/cuda/lib64/"
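To confirm that the two entries are actually visible to a session, a small check like the one below can help. It is only a sketch: has_entry and check_cuda_env are hypothetical helper names, and the sketch assumes the CUDA paths shown above. Since /etc/environment is normally read at login, it is best run from a fresh SSH session.

```shell
#!/usr/bin/env bash
# Return 0 if the colon-separated list $1 contains the entry $2.
# A trailing slash on either side is ignored.
has_entry() {
    local want="${2%/}"
    case ":$1:" in
        *":$want:"*|*":$want/:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether the two CUDA entries from /etc/environment are in effect.
check_cuda_env() {
    if has_entry "${PATH:-}" /usr/local/cuda/bin; then
        echo "PATH: ok"
    else
        echo "PATH: missing /usr/local/cuda/bin"
    fi
    if has_entry "${LD_LIBRARY_PATH:-}" /usr/local/cuda/lib64; then
        echo "LD_LIBRARY_PATH: ok"
    else
        echo "LD_LIBRARY_PATH: missing /usr/local/cuda/lib64"
    fi
}

check_cuda_env
```

If PATH is reported as missing the CUDA directory, the nvcc check that MATLAB performs when connecting to the board is likely to fail as well.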

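3. OpenCV

Recent system images ship with OpenCV 4, which MATLAB R2021a supports.
However, you need to distinguish between the OpenCV 3 and OpenCV 4 references (the pkg-config module is named opencv for OpenCV 3 and opencv4 for OpenCV 4); otherwise compiling the generated code may fail.
The second example given at the start of this article includes script logic that tells the two apart:
Deployment and Classification of Webcam Images on NVIDIA Jetson TX2 Platform[8]
If you would still rather go back to OpenCV 3, someone has already done all the work, so you can simply reuse it. See Build OpenCV 3.4 on NVIDIA Jetson AGX Xavier Developer Kit[9]: uninstall OpenCV 4, then build and install OpenCV 3. This takes a long time, so be patient.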
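Before choosing the link flags, it can help to ask pkg-config on the board which OpenCV module is actually present. The sketch below assumes the usual module names (opencv4 for OpenCV 4.x, opencv for OpenCV 3.x); opencv_pkg_name is an illustrative helper, not something MATLAB provides.

```shell
#!/usr/bin/env bash
# Print the pkg-config module name of the installed OpenCV:
# "opencv4" for OpenCV 4.x, "opencv" for OpenCV 3.x.
# Prints nothing and returns 1 if neither module is found.
opencv_pkg_name() {
    for name in opencv4 opencv; do
        if pkg-config --exists "$name" 2>/dev/null; then
            echo "$name"
            return 0
        fi
    done
    return 1
}

if pkg=$(opencv_pkg_name); then
    echo "module: $pkg, version: $(pkg-config --modversion "$pkg")"
else
    echo "no OpenCV pkg-config module found"
fi
```

The printed module name is the one to use in the pkg-config --cflags --libs link flags that the generated code is built with.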


03

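A Puzzling Conflict
The NVIDIA Jetson Xavier NX system image comes with the following packages preinstalled:
    CUDA Version     : 10.2
    cuDNN Version    : 8.0
    TensorRT Version : 7.1
    OpenCV Version   : 4.1.1
On this system, apart from the nvcc path and the matching library path, there is no need to specify TensorRT or cuDNN paths.
However, the setup steps for the development machine set the environment variables NVIDIA_TENSORRT and NVIDIA_CUDNN. Testing showed that the former (NVIDIA_TENSORRT) causes deployment of TensorRT-based code to the Jetson to fail, so that environment variable has to be removed on the development machine.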


04

Checking That the Environment Works
1. Connect to the device
    >> hwobj = jetson('jetson-host','user','password')
    Checking for CUDA availability on the Target...
    Checking for 'nvcc' in the target system path...
    Checking for cuDNN library availability on the Target...
    Checking for TensorRT library availability on the Target...
    Checking for prerequisite libraries is complete.
    Gathering hardware details...
    Checking for third-party library availability on the Target...
    Gathering hardware details is complete.
     Board name         : NVIDIA Jetson AGX Xavier
     CUDA Version       : 10.2
     cuDNN Version      : 8.0
     TensorRT Version   : 7.1
     GStreamer Version  : 1.14.5
     V4L2 Version       : 1.14.2-1
     SDL Version        : 1.2
     OpenCV Version     : 4.1.1
     Available Webcams  :
     Available GPUs     : Xavier
    hwobj =
      jetson with properties:
           DeviceAddress: 'sha-xaviernx'
                    Port: 22
               BoardName: 'NVIDIA Jetson AGX Xavier'
             CUDAVersion: '10.2'
            cuDNNVersion: '8.0'
         TensorRTVersion: '7.1'
              SDLVersion: '1.2'
             V4L2Version: '1.14.2-1'
        GStreamerVersion: '1.14.5'
           OpenCVVersion: '4.1.1'
                 GPUInfo: [1×1 struct]
              WebcamList: []
2. Check cuDNN-based code generation
    >> envCfg = coder.gpuEnvConfig('jetson');
    envCfg.DeepLibTarget = 'cudnn';
    envCfg.DeepCodegen = 1;
    envCfg.Quiet = 0;
    envCfg.HardwareObject = hwobj;
    coder.checkGpuInstall(envCfg)
    Compatible GPU       : PASSED
    CUDA Environment     : PASSED
    	Runtime   : PASSED
    	cuFFT     : PASSED
    	cuSOLVER  : PASSED
    	cuBLAS    : PASSED
    cuDNN Environment    : PASSED
    Deep Learning (cuDNN) Code Generation: PASSED
    ans =
      struct with fields:
                     gpu: 1
                    cuda: 1
                   cudnn: 1
                tensorrt: 0
            basiccodegen: 0
           basiccodeexec: 0
             deepcodegen: 1
            deepcodeexec: 0
        tensorrtdatatype: 0
               profiling: 0
3. Check TensorRT-based code generation
    >> envCfg = coder.gpuEnvConfig('jetson');
    envCfg.DeepLibTarget = 'tensorrt';
    envCfg.DeepCodegen = 1;
    envCfg.Quiet = 0;
    envCfg.HardwareObject = hwobj;
    coder.checkGpuInstall(envCfg)
    Compatible GPU       : PASSED
    CUDA Environment     : PASSED
    	Runtime   : PASSED
    	cuFFT     : PASSED
    	cuSOLVER  : PASSED
    	cuBLAS    : PASSED
    cuDNN Environment    : PASSED
    TensorRT Environment : PASSED (Warning: Deep learning code generation has been tested with TensorRT v7.2. The provided TensorRT library v7.1 may not be fully compatible.)
    Deep Learning (TensorRT) Code Generation: PASSED
    ans =
      struct with fields:
                     gpu: 1
                    cuda: 1
                   cudnn: 1
                tensorrt: 1
            basiccodegen: 0
           basiccodeexec: 0
             deepcodegen: 1
            deepcodeexec: 0
        tensorrtdatatype: 1
               profiling: 0
Everything is in place. You can now run the examples given above, or other examples from MATLAB.
The running application looks something like the screenshot in the original post, and none of it requires hand-writing a single line of C/C++/CUDA code.


05

Appendix: the yolov3_detection script
    %% Object Detection Using YOLO v3 608x608
    function out = yolov3_detection()
    %% Update buildinfo with the OpenCV library flags.
    % opencv_link_flags = '`pkg-config --cflags --libs opencv`';  % opencv 3
    opencv_link_flags = '`pkg-config --cflags --libs opencv4`';   % opencv 4
    coder.updateBuildInfo('addLinkFlags',opencv_link_flags);
    % coder.inline('never');

    % Connect to webcam
    hwobj = jetson;
    wcam = webcam(hwobj, 1, '1280x720');
    img_w = 1280;
    img_h = 720;
    player = imageDisplay(hwobj);

    %%
    orgImg = snapshot(wcam);
    image(player, orgImg);

    %%
    imgSize = 608;
    out = zeros([img_h img_w 3], 'uint8');

    ratio = min(imgSize/img_w, imgSize/img_h);

    % Image height and width after resizing image
    w = round(img_w * ratio);
    h = round(img_h * ratio);
    st_h = round((imgSize - h)/2) + 1;
    st_w = round((imgSize - w)/2) + 1;

    fps = 0;
    while true
        orgImg = snapshot(wcam);
        orgImg = fliplr(orgImg);
        in = im2single(orgImg);
        % img = imadjust(img, stretchlim(img,[0.01,0.80]));
        % img = histeq(img);
        % Creating background
        in3 = ones(imgSize, imgSize, 3, 'like', in) * 0.5;
        in2 = imresize(in, [h, w]); %,'Method','bilinear','AntiAliasing',false);
        in3(st_h:st_h+h-1, st_w:st_w+w-1, :) = in2;

        tic; % Count FPS
        predictions = yolov3_detect(in3);   % helper function, not listed in this article
        elapsedTime = toc;
        fps = .9*fps + .1*(1/elapsedTime);

        % post-processing and display the results
        out = postProcess(predictions, orgImg, w, h);   % helper function, not listed in this article
        out = insertText(out, [1, 1], sprintf('FPS %2.2f', fps), 'FontSize', 26, 'BoxColor', [0,150,0]);
        out = imresize(out, [img_h img_w]);
        image(player, out);
    end
    end


06

Appendix: generating TensorRT FP16 CUDA code from yolov3_detection and deploying it to the NVIDIA Jetson Xavier NX:
    %% connect hardware
    hwobj = jetson('host-name','user','password');

    %% Generate CUDA Code for the Target Using GPU Coder
    % To generate a CUDA executable that can be deployed on to a NVIDIA
    % target, create a GPU code configuration object for generating an executable.
    cfg = coder.gpuConfig('exe');
    cfg.GenerateReport = true;
    cfg.Hardware = coder.hardware('NVIDIA Jetson');
    cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');
    cfg.DeepLearningConfig.DataType = 'fp16';
    cfg.GpuConfig.ComputeCapability = '7.0';
    cfg.Hardware.BuildDir = '~/remoteBuildDir';
    cfg.GpuConfig.SelectCudaDevice = 0;
    cfg.GenerateExampleMain = 'GenerateCodeAndCompile';

    codegen('-config ',cfg,'yolov3_detection', '-report')

    %% Run the YOLO v3 Detection on the Target
    % Run the generated executable on the target (uncomment the next line to launch it).
    % pid = hwobj.runApplication('yolov3_detection');
Links:

[1]https://ww2.mathworks.cn/matlabcentral/fileexchange/75305-yolov3-yolov4-matlab?s_tid=srchtitle

[2]https://ww2.mathworks.cn/help/gpucoder/ug/deployment-classification-webcam-images-on-NVIDIA-Jetson-TX2.html

[3]https://ww2.mathworks.cn/help/gpucoder/gs/install-prerequisites.html

[4]https://www.mathworks.com/help/gpucoder/gs/setting-up-the-toolchain.html

[5]https://zhuanlan.zhihu.com/p/350323501

[6]https://developer.nvidia.com/jetson-nx-developer-kit-sd-card-image

[7]https://ww2.mathworks.cn/help/releases/R2021a/supportpkg/nvidia/ug/install-and-setup-prerequisites.html

[8]https://www.mathworks.com/help/gpucoder/ug/deployment-classification-webcam-images-on-NVIDIA-Jetson-TX2.html

[9]https://www.jetsonhacks.com/2018/11/08/build-opencv-3-4-on-nvidia-jetson-agx-xavier-developer-kit/

